Introduce multi-slice NuGraph2 inference and filtering#868
Introduce multi-slice NuGraph2 inference and filtering#868
Conversation
Tag for integration release
There was a problem hiding this comment.
A few questions:
- information and attributes of the "spacepoint" are referred to the output of cluster3d step, correct?
- the energyDepNtuple includes only true info (MC)?
- can you clarify the comment at L=311? is this referred to the possibility of multiple v interactions in the same event?
Also, maybe I'm misunderstanding, but if the input in terms of spacepoints is cluster3d and this has the mythical points setting (2 planes are sufficient to find a match in 3d) isn't the case at L=360 indicated as an error possibly happening?
L417: what does this energy deposit map include? I understand this option is not available for ICARUS, but I was curious to understand. Never mind, I responded myself this is something used in uB.
L447-9: I'm not sure I'm following. It is said negative indicates no associated particle and thus one should search for a possible match track id <--> particle before taking the abs, but when doing this check the abs is done, is this wanted? I went to check a bit the TrackIdToParticle_P function but I'm not expert on how this operates so want to make sure this is what was intended here.
There was a problem hiding this comment.
L459: I'm not sure I'm getting "invisible" particles, why invisible?
L523: maybe it's not an issue, but here a vector of float is declared whereas the optical flash has a vector of double for the photoelectrons per channel (https://github.com/SBNSoftware/lardataobj/blob/7c9c4f84cddf3692803f92e06ae17f4dcc7e0cb0/lardataobj/RecoBase/OpFlash.h#L88)
Especially on this part related to optical flashes and in genera PMT-related reconstruction, I must say I'm not an expert so I trust your choices, but please do consider contacting an expert if needed.
There was a problem hiding this comment.
Sorry for the late replies!
information and attributes of the "spacepoint" are referred to the output of cluster3d step, correct?
Yes and no. The input label in the fcl is "nuslhits" (i.e. ICARUSTrueNuSliceHitsProducer) which takes cluster3d spacepoints and filters them based on some conditions (matching hits in a slice, ==3 associated hits, ...)
the energyDepNtuple includes only true info (MC)?
Yes
can you clarify the comment at L=311? is this referred to the possibility of multiple v interactions in the same event?
Yes
Also, maybe I'm misunderstanding, but if the input in terms of spacepoints is cluster3d and this has the mythical points setting (2 planes are sufficient to find a match in 3d) isn't the case at L=360 indicated as an error possibly happening?
See comment above about filtering in ICARUSTrueNuSliceHitsProducer
L417: what does this energy deposit map include? I understand this option is not available for ICARUS, but I was curious to understand. Never mind, I responded myself this is something used in uB.
Yes, it's a MicroBooNE tool
L447-9: I'm not sure I'm following. It is said negative indicates no associated particle and thus one should search for a possible match track id <--> particle before taking the abs, but when doing this check the abs is done, is this wanted? I went to check a bit the TrackIdToParticle_P function but I'm not expert on how this operates so want to make sure this is what was intended here.
It's kind of complicated here. I think G4 changes the ID to negative for particles that are present but that have some information dropped. We do not use the dropped information in the HDF5 file so we can revert the sign to positive. This implies that negative values in the HDF5 are only the default values, which are kept only if no particle is found.
L459: I'm not sure I'm getting "invisible" particles, why invisible?
It means neutral particles, which are not visible in the TPC (util they e.g. pair produce, but then they turn into other particles)
L523: maybe it's not an issue, but here a vector of float is declared whereas the optical flash has a vector of double for the photoelectrons per channel (https://github.com/SBNSoftware/lardataobj/blob/7c9c4f84cddf3692803f92e06ae17f4dcc7e0cb0/lardataobj/RecoBase/OpFlash.h#L88) Especially on this part related to optical flashes and in genera PMT-related reconstruction, I must say I'm not an expert so I trust your choices, but please do consider contacting an expert if needed.
It will automatically convert the doubles to floats when filling the vector. We don't need full precision here and choose to save disk space.
There was a problem hiding this comment.
Thanks a lot @cerati for all the details, this helped a lot following the changes done and the logic.
I have only a remaining comment but just for my understanding:
on the spacepoints the conclusion then is that, even if mythical points are applied here (as part of cluster3d) and thus in principle only 2 planes could be used to build them, because of the way nu-graph operates, namely the need to build triangle(s) and thus 3 hits at least, the request is to have at least 3 hits/3 planes no matter what, correct?
There was a problem hiding this comment.
Minor comment on L=123: if one empty plane is sufficient for you to skip the slice you can exit the loop over planes whenever one is found.
Question: my understanding is this module generalises the previous one (based on uB) to act on more slices simultaneously, but only the slices w/ all the planes filled (which I guess implies a minimum number of hits?) will be considered, correct?
There was a problem hiding this comment.
Also a minor comment on L=145-164: I'd have probably used a switch case here. Is there a default case? I'd maybe add an option for that or make sure any exception is properly handled here.
I'd add a comment to explain what pos_norm indicates, something like normalised 2d coordinates on the plane should work (as GNN wants, as @rtriozzi reminded me :)).
Instead in reference to L=209 (and similarly on another module explicitly doing the same check on the number of hits per plane to be above this arbitrary threshold set to 3) I'd ask you to create a const int to host this threshold (not a big fan of hardcoded values) in case it might be revisited in the future. Even better would be to set it "globally" in a different place to have it automatically consistently applied everywhere nugraph needs it, but I understand if this is not feasible at this point.
There was a problem hiding this comment.
Based on the comment at L=213 I think there's an error at L=225:
e.n1 = d.triangles.at(i+1); should be e.n1 = d.triangles.at(I+2);
Please check this and ensure to make the correction.
There was a problem hiding this comment.
L=265, 271, 277: these checks assume that the indexing of space points is consistent with that of hits on the tree planes u, v, y - is that granted?
If I'm getting the logic correctly at L=267 the index corresponding to the hit matching the space point on a certain plane is loaded, I realise it is very complicated but maybe I'd "increase" the explanation at L=167 where the inverse map is created to ensure that what's meant with the key is a bit more clear.
If I'm following, at L=290 the vector named nodeft indicates the normalised (Delaunay?) -input- features, I'd probably ask you to add a comment to make it a bit more clear.
Additional question to follow the logic (related to L=396, but not limited to this point): the output is the classification outcome of nu-graph (after the creation of the input edges in 2d and 3d and the inference)?
On the last part I must confess my ignorance: I tried to follow the logic and checked a bit of details of how Libtorch operates, but I guess I'll trust your experience :)
There was a problem hiding this comment.
Minor comment on L=123: if one empty plane is sufficient for you to skip the slice you can exit the loop over planes whenever one is found.
Agreed. Not sure it will change much as the check in the "if" is simple. I'll leave it to @rtriozzi whether he wants to make the change.
Question: my understanding is this module generalises the previous one (based on uB) to act on more slices simultaneously, but only the slices w/ all the planes filled (which I guess implies a minimum number of hits?) will be considered, correct?
Yes, this is correct. The issue is to avoid crashes, the network (at least the 1D version) assumes that there are hits in each of the plans and crashes if the condition is not respected. This may change in the future.
Also a minor comment on L=145-164: I'd have probably used a switch case here. Is there a default case? I'd maybe add an option for that or make sure any exception is properly handled here.
I don't think there is a default case. A switch case would be probably easier to read but effectively the same as what we have now, so I don't feel strongly. But maybe better to keep what is there now to avoid re-validating?
I'd add a comment to explain what pos_norm indicates, something like normalised 2d coordinates on the plane should work (as GNN wants, as @rtriozzi reminded me :)).
Yes, a comment would be useful. @rtriozzi is it easy enough for you to add it?
Instead in reference to L=209 (and similarly on another module explicitly doing the same check on the number of hits per plane to be above this arbitrary threshold set to 3) I'd ask you to create a const int to host this threshold (not a big fan of hardcoded values) in case it might be revisited in the future. Even better would be to set it "globally" in a different place to have it automatically consistently applied everywhere nugraph needs it, but I understand if this is not feasible at this point.
This is not a tunable cut, but really requiring that there are at least 3 hits (otherwise you cannot make a triangle with Delaunay).
Based on the comment at L=213 I think there's an error at L=225:
e.n1 = d.triangles.at(i+1); should be e.n1 = d.triangles.at(I+2);
Please check this and ensure to make the correction.
Thanks for catching this! @rtriozzi , can you make the change?
L=265, 271, 277: these checks assume that the indexing of space points is consistent with that of hits on the tree planes u, v, y - is that granted?
Yes!
If I'm getting the logic correctly at L=267 the index corresponding to the hit matching the space point on a certain plane is loaded, I realise it is very complicated but maybe I'd "increase" the explanation at L=167 where the inverse map is created to ensure that what's meant with the key is a bit more clear.
Key is a technical term for the index of art pointers. I think it is precise enough, but not opposed to more explanations. @rtriozzi please let me know if you would like more context to add to L167.
If I'm following, at L=290 the vector named nodeft indicates the normalised (Delaunay?) -input- features, I'd probably ask you to add a comment to make it a bit more clear.
Yes, these are the node input features to the GNN. It should be enough to add this as a comment (@rtriozzi )
Additional question to follow the logic (related to L=396, but not limited to this point): the output is the classification outcome of nu-graph (after the creation of the input edges in 2d and 3d and the inference)?
On the last part I must confess my ignorance: I tried to follow the logic and checked a bit of details of how Libtorch operates, but I guess I'll trust your experience :)
Yes, this part is obscure due to the libtorch interface. So there is some conversion from the abstract pytorch objects to the "numbers" we care about. The outputs is on a per hit (node) basis: a single float for filter (background score), and an array of 5 floats for semantic (one per semantic class).
There was a problem hiding this comment.
I think I agree with all the comments.
I'm generally commenting a lot the code, but that's also my taste so feel free to choose the easiest option as far as comments requests are concerned.
I'd do the correction at L=225 of course.
And I guess I have a curiosity on this point:
The issue is to avoid crashes, the network (at least the 1D version) assumes that there are hits in each of the plans and crashes if the condition is not respected. This may change in the future.
do you think this requirement could be removed with the 2d version? if so why?
There was a problem hiding this comment.
L=56 I guess this threshold is based on the quality of the space point created by cluster3d, which is among the inputs nu-graph uses, is this threshold totally arbitrary or set based on previous (I guess also perhaps old) studies made by cluster3d developers or something else? Thanks for the details.
L=86 so here you take the first element just to be safe, correct? Because I'm confused how find1SliceFromHit could have size >1, but this is not nu-graph related of course...
L=96 and similarly other modules please add a const int minNumberOfHits = 3 or something similar and avoid having this number hardcoded here.
There was a problem hiding this comment.
Sorry, that I missed these comments!
L=56 I guess this threshold is based on the quality of the space point created by cluster3d, which is among the inputs nu-graph uses, is this threshold totally arbitrary or set based on previous (I guess also perhaps old) studies made by cluster3d developers or something else? Thanks for the details.
This threshold is based on studies by Eric Novello, see e.g. docdb 38948
L=86 so here you take the first element just to be safe, correct? Because I'm confused how find1SliceFromHit could have size >1, but this is not nu-graph related of course...
I think it has size > 1 because we pass the hit collection, so it's one slice per hit. But hits considered here are all in the same slice, so we only need the first.
L=96 and similarly other modules please add a const int minNumberOfHits = 3 or something similar and avoid having this number hardcoded here.
Yes, that can be done. @rtriozzi is it easy for you to do so? (sorry...)
This PR integrates additional NuGraph2 modules in ICARUS, including some that were already introduced in Giuseppe's icaruscode PR #815, on which this PR is based, as well as new modules designed to:
Along with the modules, a set of FHiCL files is provided. They are all based on the standard
v10_06_00_01p01configurations. Following the review of this PR, similar FHiCL files based ondevelopwill be provided.More information on NuGraph2 and its applications can be found in SBN DocDB #40585, #44208, Indico.
A companion PR to sbncode is available for dealing with CAF-making after the NG2 filter module (sbncode PR #616).
Associated PRs
Review
Tagging for review @acampani and @PetrilloAtWork as reconstruction and infrastructure experts. Thanks a lot!